Overview

Brought to you by YData

Dataset statistics

Number of variables: 6
Number of observations: 21013536
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 961.9 MiB
Average record size in memory: 48.0 B
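
The memory figures are consistent with every column storing one 8-byte value per row. That is an assumption, not something the report states. A minimal sketch of the arithmetic, using only numbers from this overview:

```python
# Figures copied from the dataset statistics.
N_ROWS = 21_013_536
N_COLS = 6
MIB = 1024 ** 2

# Assumption: each column holds one 8-byte value per row (for example
# int64, float64, datetime64[ns], or an object pointer).
BYTES_PER_VALUE = 8

column_mib = N_ROWS * BYTES_PER_VALUE / MIB   # memory size of one column
record_bytes = N_COLS * BYTES_PER_VALUE       # average record size
total_mib = N_ROWS * record_bytes / MIB       # total size in memory

print(f"per column: {column_mib:.1f} MiB")
print(f"per record: {record_bytes} B")
print(f"total:      {total_mib:.1f} MiB")
```

Under that assumption the three values round to the 160.3 MiB per variable, 48.0 B per record, and 961.9 MiB total shown in this report.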

Variable types

Numeric: 4
Unsupported: 1
DateTime: 1

Alerts

RatingID is uniformly distributed (Uniform)
RatingID has unique values (Unique)
Vintage is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2025-05-09 11:43:05.558394
Analysis finished: 2025-05-09 11:51:00.703021
Duration: 7 minutes and 55.14 seconds
Software version: ydata-profiling v4.16.1
Download configuration: config.json
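
The reported duration is simply the difference between the two timestamps. A small sketch of that check with the standard library (the format string is chosen here to match how the timestamps are printed):

```python
from datetime import datetime

# Format matching the timestamps as printed in the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2025-05-09 11:43:05.558394", FMT)
finished = datetime.strptime("2025-05-09 11:51:00.703021", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```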

Variables

RatingID
Real number (ℝ)

Alerts: Uniform, Unique

Distinct: 21013536
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10506768
Minimum: 1
Maximum: 21013536
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 160.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1050677.8
Q1: 5253384.8
Median: 10506768
Q3: 15760152
95-th percentile: 19962859
Maximum: 21013536
Range: 21013535
Interquartile range (IQR): 10506768
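
These quantiles match what linear interpolation gives for the sequence 1..N. Both points are assumptions: the report does not name its interpolation method, and it only implies that RatingID is a plain row counter (minimum 1, maximum equal to the row count, all values unique). A sketch with a hypothetical helper, `linear_quantile_of_range`:

```python
def linear_quantile_of_range(n: int, q: float) -> float:
    """q-quantile of the values 1..n under linear interpolation.

    The sorted values are 1, 2, ..., n, so fractional position q * (n - 1)
    (zero-based) maps directly to the value 1 + q * (n - 1).
    """
    return 1 + q * (n - 1)


N = 21_013_536  # number of observations in the report

quantiles = {
    "5-th percentile": linear_quantile_of_range(N, 0.05),
    "Q1": linear_quantile_of_range(N, 0.25),
    "median": linear_quantile_of_range(N, 0.50),
    "Q3": linear_quantile_of_range(N, 0.75),
    "95-th percentile": linear_quantile_of_range(N, 0.95),
}
iqr = quantiles["Q3"] - quantiles["Q1"]

for label, value in quantiles.items():
    print(f"{label}: {value:.2f}")
print(f"IQR: {iqr:.2f}")
```

The report shows these values rounded to eight significant digits, which is why 10506768.5 appears as 10506768.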

Descriptive statistics

Standard deviation: 6066085.5
Coefficient of variation (CV): 0.57735026
Kurtosis: -1.2
Mean: 10506768
Median Absolute Deviation (MAD): 5253384
Skewness: -2.1117456 × 10^-16
Sum: 2.2078436 × 10^14
Variance: 3.6797393 × 10^13
Monotonicity: Strictly increasing
[Figure: histogram with fixed-size bins (bins=50)]
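
If RatingID is exactly the sequence 1..N, its descriptive statistics have closed forms. That is an assumption, although the Unique alert, the minimum of 1, the maximum equal to the row count, and the strictly increasing order all point that way. The sketch below evaluates those closed forms so they can be compared with the figures in this section. The variance uses the sample (ddof=1) form, which is also an assumption about how the report computes it.

```python
import math

N = 21_013_536  # number of observations in the report

# Assumption: RatingID takes each integer 1..N exactly once.
total = N * (N + 1) // 2        # sum of 1..N
mean = (N + 1) / 2
variance = N * (N + 1) / 12     # sample variance (ddof=1) of 1..N
std = math.sqrt(variance)
cv = std / mean                 # tends to 1/sqrt(3) as N grows

# Population excess kurtosis of a discrete uniform distribution on N points.
kurtosis = -6 * (N**2 + 1) / (5 * (N**2 - 1))

print(f"sum      = {total:.7e}")
print(f"mean     = {mean}")
print(f"variance = {variance:.7e}")
print(f"std      = {std:.1f}")
print(f"cv       = {cv:.8f}")
print(f"kurtosis = {kurtosis:.4f}")
```

The coefficient of variation near 0.577 and the kurtosis near -1.2 are the signatures of a uniform distribution, which is what the Uniform alert reports.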
Common values

Value                      Count     Frequency (%)
21013520                   1         < 0.1%
21013519                   1         < 0.1%
21013518                   1         < 0.1%
21013517                   1         < 0.1%
21013516                   1         < 0.1%
21013515                   1         < 0.1%
21013514                   1         < 0.1%
21013513                   1         < 0.1%
21013512                   1         < 0.1%
21013511                   1         < 0.1%
Other values (21013526)    21013526  > 99.9%

Minimum 10 values

Value                      Count     Frequency (%)
1                          1         < 0.1%
2                          1         < 0.1%
3                          1         < 0.1%
4                          1         < 0.1%
5                          1         < 0.1%
6                          1         < 0.1%
7                          1         < 0.1%
8                          1         < 0.1%
9                          1         < 0.1%
10                         1         < 0.1%

Maximum 10 values

Value                      Count     Frequency (%)
21013536                   1         < 0.1%
21013535                   1         < 0.1%
21013534                   1         < 0.1%
21013533                   1         < 0.1%
21013532                   1         < 0.1%
21013531                   1         < 0.1%
21013530                   1         < 0.1%
21013529                   1         < 0.1%
21013528                   1         < 0.1%
21013527                   1         < 0.1%

UserID
Real number (ℝ)

Distinct: 1056079
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1433100.8
Minimum: 1000001
Maximum: 2063390
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 160.3 MiB

Quantile statistics

Minimum: 1000001
5-th percentile: 1030895
Q1: 1180506
Median: 1357576
Q3: 1692543
95-th percentile: 1988865
Maximum: 2063390
Range: 1063389
Interquartile range (IQR): 512037

Descriptive statistics

Standard deviation: 308756.57
Coefficient of variation (CV): 0.21544651
Kurtosis: -1.0436298
Mean: 1433100.8
Median Absolute Deviation (MAD): 227567
Skewness: 0.47199114
Sum: 3.0114516 × 10^13
Variance: 9.5330619 × 10^10
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
Common values

Value                      Count     Frequency (%)
1084433                    2986      < 0.1%
1034989                    2979      < 0.1%
1070878                    2613      < 0.1%
1048267                    2597      < 0.1%
1160536                    2392      < 0.1%
1000272                    2309      < 0.1%
1006657                    2307      < 0.1%
1019610                    2291      < 0.1%
1107271                    2225      < 0.1%
1035125                    2101      < 0.1%
Other values (1056069)     20988736  99.9%

Minimum 10 values

Value                      Count     Frequency (%)
1000001                    97        < 0.1%
1000002                    29        < 0.1%
1000003                    22        < 0.1%
1000004                    759       < 0.1%
1000005                    108       < 0.1%
1000006                    11        < 0.1%
1000007                    9         < 0.1%
1000008                    150       < 0.1%
1000009                    46        < 0.1%
1000010                    436       < 0.1%

Maximum 10 values

Value                      Count     Frequency (%)
2063390                    9         < 0.1%
2063389                    37        < 0.1%
2063388                    12        < 0.1%
2063387                    5         < 0.1%
2063386                    7         < 0.1%
2063385                    10        < 0.1%
2063384                    6         < 0.1%
2063383                    53        < 0.1%
2063382                    8         < 0.1%
2063381                    5         < 0.1%
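
The "Other values" row and the Distinct (%) figure for UserID follow from the row count, the distinct count, and the ten most frequent values. A short sketch of that bookkeeping, with all numbers copied from the UserID section:

```python
N_ROWS = 21_013_536      # number of observations
N_DISTINCT = 1_056_079   # distinct UserID values

# Counts of the ten most frequent UserID values.
top10_counts = [2986, 2979, 2613, 2597, 2392, 2309, 2307, 2291, 2225, 2101]

other_rows = N_ROWS - sum(top10_counts)          # rows behind "Other values"
other_distinct = N_DISTINCT - len(top10_counts)  # users behind "Other values"
distinct_pct = 100 * N_DISTINCT / N_ROWS
other_pct = 100 * other_rows / N_ROWS
ratings_per_user = N_ROWS / N_DISTINCT           # average, not shown in the report

print(f"Other values ({other_distinct}): {other_rows} rows, {other_pct:.1f}%")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Average ratings per user: {ratings_per_user:.1f}")
```

The average of roughly 19.9 ratings per user is derived here and does not appear in the report itself.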

WineID
Real number (ℝ)

Distinct: 100646
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 147287.3
Minimum: 100001
Maximum: 200795
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 160.3 MiB

Quantile statistics

Minimum: 100001
5-th percentile: 102025
Q1: 118159
Median: 155313
Q3: 168895
95-th percentile: 186393
Maximum: 200795
Range: 100794
Interquartile range (IQR): 50736

Descriptive statistics

Standard deviation: 27655.253
Coefficient of variation (CV): 0.187764
Kurtosis: -1.2073695
Mean: 147287.3
Median Absolute Deviation (MAD): 19419
Skewness: -0.16127642
Sum: 3.095027 × 10^12
Variance: 7.6481303 × 10^8
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
Common values

Value                      Count     Frequency (%)
155289                     27415     0.1%
179010                     23626     0.1%
179011                     21216     0.1%
111391                     20913     0.1%
167418                     20817     0.1%
162494                     20456     0.1%
167419                     18823     0.1%
135825                     18748     0.1%
179012                     18575     0.1%
167420                     17759     0.1%
Other values (100636)      20805188  99.0%

Minimum 10 values

Value                      Count     Frequency (%)
100001                     2625      < 0.1%
100002                     10        < 0.1%
100003                     62        < 0.1%
100004                     110       < 0.1%
100005                     72        < 0.1%
100006                     1837      < 0.1%
100007                     43        < 0.1%
100008                     424       < 0.1%
100009                     1971      < 0.1%
100010                     1504      < 0.1%

Maximum 10 values

Value                      Count     Frequency (%)
200795                     5         < 0.1%
200794                     5         < 0.1%
200793                     5         < 0.1%
200792                     5         < 0.1%
200791                     5         < 0.1%
200790                     5         < 0.1%
200789                     5         < 0.1%
200788                     5         < 0.1%
200787                     5         < 0.1%
200786                     5         < 0.1%

Vintage
Unsupported

Alerts: Rejected, Unsupported

Missing: 0
Missing (%): 0.0%
Memory size: 160.3 MiB
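
The report flags Vintage as Unsupported and suggests checking whether it needs cleaning, but it does not say why the type could not be inferred. One possible cause, assumed here purely for illustration, is a column that mixes integer years with non-numeric labels. The sketch below uses a hypothetical helper, `parse_vintage`, and a hypothetical label `"N.V."`. Only the values 1950 and 0 actually appear in this report's sample rows.

```python
from typing import Optional


def parse_vintage(raw: object) -> Optional[int]:
    """Hypothetical helper: return the vintage as an int year, or None.

    Anything that does not parse as a whole number maps to None.
    Treating 0 as "no vintage" is an assumption made for this sketch.
    """
    try:
        year = int(str(raw).strip())
    except ValueError:
        return None
    return year if year > 0 else None


# 1950 and 0 occur in the sample rows; "N.V." is an invented example label.
examples = [1950, "1950", 0, "N.V."]
cleaned = [parse_vintage(value) for value in examples]
print(cleaned)
```

Once the column holds only integers and missing markers, a profiler should be able to treat it as numeric. Whether 0 really means "no vintage" has to be checked against the data's own documentation.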

Rating
Real number (ℝ)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8848798
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 160.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2.5
Q1: 3.5
Median: 4
Q3: 4.5
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.73757873
Coefficient of variation (CV): 0.18985883
Kurtosis: 1.2489558
Mean: 3.8848798
Median Absolute Deviation (MAD): 0.5
Skewness: -0.71678179
Sum: 81635060
Variance: 0.54402238
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=9)]
Common values

Value    Count      Frequency (%)
4        8301655    39.5%
3.5      3389567    16.1%
5        2950264    14.0%
3        2755661    13.1%
4.5      2505818    11.9%
2.5      468045     2.2%
2        425593     2.0%
1        152452     0.7%
1.5      64481      0.3%

Minimum 10 values

Value    Count      Frequency (%)
1        152452     0.7%
1.5      64481      0.3%
2        425593     2.0%
2.5      468045     2.2%
3        2755661    13.1%
3.5      3389567    16.1%
4        8301655    39.5%
4.5      2505818    11.9%
5        2950264    14.0%

Maximum 10 values

Value    Count      Frequency (%)
5        2950264    14.0%
4.5      2505818    11.9%
4        8301655    39.5%
3.5      3389567    16.1%
3        2755661    13.1%
2.5      468045     2.2%
2        425593     2.0%
1.5      64481      0.3%
1        152452     0.7%
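
Because Rating takes only nine distinct values, its summary statistics can be recomputed from the frequency table alone. The sketch below does that with the counts from this section. It uses the population variance, an assumption that makes no visible difference at this sample size.

```python
import math

# Rating value -> count, copied from the frequency table.
rating_counts = {
    1.0: 152452,
    1.5: 64481,
    2.0: 425593,
    2.5: 468045,
    3.0: 2755661,
    3.5: 3389567,
    4.0: 8301655,
    4.5: 2505818,
    5.0: 2950264,
}

n = sum(rating_counts.values())
total = sum(value * count for value, count in rating_counts.items())
mean = total / n

# Population variance of the weighted values.
variance = sum(count * (value - mean) ** 2
               for value, count in rating_counts.items()) / n
std = math.sqrt(variance)

print(f"n        = {n}")
print(f"sum      = {total}")
print(f"mean     = {mean:.7f}")
print(f"variance = {variance:.6f}")
print(f"std      = {std:.6f}")
```

The counts add up to the full 21013536 observations, and the recomputed mean, variance, and standard deviation agree with the reported values to the precision shown.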

Date
Date

Distinct: 19746027
Distinct (%): 94.0%
Missing: 0
Missing (%): 0.0%
Memory size: 160.3 MiB
Minimum: 2012-01-03 08:20:53
Maximum: 2021-12-31 23:59:56
Invalid dates: 0
Invalid dates (%): 0.0%
[Figure: histogram with fixed-size bins (bins=50)]

Interactions

[Figures: pairwise interaction plots, not reproduced here]

Correlations

            Rating   RatingID   UserID   WineID
Rating       1.000     -0.076   -0.128   -0.017
RatingID    -0.076      1.000   -0.039    0.032
UserID      -0.128     -0.039    1.000    0.029
WineID      -0.017      0.032    0.029    1.000
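
All off-diagonal correlations are weak. A small sketch that checks the matrix is symmetric and picks out the strongest pair, using the values from the matrix in this section (the report does not say which correlation coefficient they represent):

```python
labels = ["Rating", "RatingID", "UserID", "WineID"]

# Correlation matrix copied from the report, rows and columns in `labels` order.
corr = [
    [1.000, -0.076, -0.128, -0.017],
    [-0.076, 1.000, -0.039, 0.032],
    [-0.128, -0.039, 1.000, 0.029],
    [-0.017, 0.032, 0.029, 1.000],
]

size = len(labels)
is_symmetric = all(corr[i][j] == corr[j][i]
                   for i in range(size) for j in range(size))

# Each unordered pair once, keyed by absolute correlation.
pairs = [(abs(corr[i][j]), labels[i], labels[j])
         for i in range(size) for j in range(i + 1, size)]
strength, first, second = max(pairs)

print(f"symmetric: {is_symmetric}")
print(f"strongest pair: {first} / {second} (|r| = {strength})")
```

The strongest relationship is between Rating and UserID at -0.128, which is still a weak association.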

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

            RatingID   UserID    WineID   Vintage   Rating   Date
0           1          1604441   136103   1950      4.0      2019-10-14 11:20:52
1           2          1291483   136103   1950      5.0      2019-11-28 03:36:33
2           3          1070605   104036   1950      5.0      2017-12-28 10:15:55
3           4          1080181   144864   1950      5.0      2016-06-23 02:16:22
4           5          1834379   111430   1950      5.0      2021-05-16 17:58:14
5           6          1995440   157985   1950      4.0      2016-01-06 22:14:14
6           7          1166181   101794   1950      5.0      2018-04-15 12:04:46
7           8          1839846   136103   1950      5.0      2020-07-18 15:41:19
8           9          1693747   136103   1950      1.0      2018-11-23 01:48:57
9           10         1478537   135897   1950      4.0      2015-05-04 19:52:09

Last rows

            RatingID   UserID    WineID   Vintage   Rating   Date
21013526    21013527   1106198   120201   0         4.0      2021-07-10 07:02:24
21013527    21013528   1274281   112475   0         4.5      2019-04-14 17:36:28
21013528    21013529   1175074   112028   0         4.0      2020-02-22 08:31:35
21013529    21013530   1226184   174632   0         3.5      2019-10-07 00:10:55
21013530    21013531   1096101   157677   0         2.5      2017-04-25 00:48:56
21013531    21013532   2015383   113302   0         3.0      2019-02-16 14:15:48
21013532    21013533   1868739   111440   0         2.0      2018-09-30 16:47:05
21013533    21013534   1402947   142467   0         3.0      2021-01-29 19:21:14
21013534    21013535   1360350   111440   0         4.0      2021-07-26 14:02:14
21013535    21013536   1192603   111393   0         5.0      2016-11-17 04:48:43